TITLE OF THE INVENTION

VIDEO ENCODING METHOD, VIDEO DECODING METHOD, VIDEO ENCODING 
PROGRAM, VIDEO DECODING PROGRAM, VIDEO ENCODING APPARATUS, 
AND VIDEO DECODING APPARATUS 

BACKGROUND OF THE INVENTION

Field of the Invention 
[0001] 

The present invention relates to compression encoding 
and decoding of moving pictures and, more particularly, to 
a method of efficiently transmitting encoding conditions.

Related Background Art 
[0002] 

Compression encoding techniques for moving picture signals
are conventionally used for transmission and for storage and
regeneration of moving picture signals. Well-known techniques
include, for example, international standard video coding
methods such as ITU-T Recommendation H.263 and ISO/IEC
International Standard 14496-2 (MPEG-4 Visual).

[0003]

Another known, newer encoding system is a video coding
method scheduled for joint international standardization
by ITU-T and ISO/IEC: ITU-T Recommendation H.264 and ISO/IEC
International Standard 14496-10. The general encoding
techniques used in these video coding methods are disclosed,
for example, in Nonpatent Document 1 presented below.
[0004]

[Nonpatent Document 1] 

Basic Technologies on International Image Coding 
Standards 

(co-authored by Fumitaka Ono and Hiroshi Watanabe and
published March 20, 1998 by CORONA PUBLISHING CO., LTD.)

SUMMARY OF THE INVENTION 

[0005] 

In these encoding methods, an encoding apparatus is
configured to partition an image into multiple regions and
to encode each of the regions under the same conditions.
The encoding apparatus groups the pixel values included in
each region into a plurality of encoding units, obtains
residuals from predetermined predictive signals, and then
performs a Discrete Cosine Transform (DCT) of the difference
signals, quantization of the DCT coefficients, and
variable-length encoding of the quantized data. This
generates compression-encoded data (a bitstream).
[0006]

Sizes of encoding units differ depending upon image
encoding conditions (hereinafter referred to as "encoding
modes"). Fig. 1 is a diagram showing relations between image
encoding modes and encoding units. One of the encoding modes
is a frame encoding mode, which performs encoding without
separating the scan lines of an image (hereinafter referred
to as "frame mode"). Numeral 802 in Fig. 1 denotes this frame
mode. An encoding unit in this case is a macroblock
consisting of 16x16 pixels.
[0007] 

In contrast, an encoding mode that performs encoding with
the scan lines of an image separated is called a field
encoding mode (803 in Fig. 1, hereinafter referred to as
"field mode"). Numeral 804 in Fig. 1 designates a case
wherein scan lines of an interlaced image are separated into
even scan lines and odd scan lines. Encoding units in this
case are macroblock units, as in the case of the frame
encoding, but an encoding unit after merging of the scan
lines is 16x32 pixels.
[0008] 

Furthermore, there are a mode of performing encoding with
scan lines separated in encoding units and a mode of
performing encoding without separating scan lines in
encoding units. Numeral 805 in Fig. 1 represents a case in
which encoding is performed without separating scan lines in
encoding units. Encoding units in this case are macroblocks.
In the case where the scan lines are adaptively separated
or not separated in encoding units (hereinafter referred
to as "MB_AFF mode"), as indicated by 806 in Fig. 1, encoding
units are represented by "macroblock pairs" each consisting
of 16x32 pixels. As described above, the encoding apparatus
changes the sizes of encoding units according to the encoding
modes to achieve an optimal structure, thereby performing
efficient compression encoding.

[0009] 

On the other hand, in partitioning an image into multiple
regions, the encoding apparatus is also configured to define
the regions in the most efficient encoding units for the
encoding mode. Fig. 2(a) and Fig. 2(b) are diagrams showing
examples of regions in images partitioned in the prior art.
The image 901 of Fig. 2(a) is partitioned into two regions,
one being a region filled with the same pattern as block 902
and the other an unfilled region. The frame mode is assumed
herein, and region 903 is defined in macroblock units in the
order indicated by dashed arrow 904 from the center of the
image. The image 905 of Fig. 2(b) is also partitioned into
two regions, one being a region filled with the same pattern
as block 906 and the other an unfilled region. The MB_AFF
mode is assumed herein, and region 907 is defined in units of
"macroblock pairs" in the order indicated by dashed arrow 908
from the center of the image.

[0010]

Compression-encoded data encoded in the encoding units
is put together on a region-by-region basis, and related
information such as the encoding mode is attached thereto,
followed by transmission or recording thereof. Putting the
compression-encoded data together on a region-by-region
basis has the advantage that, even if an error occurs because
of contamination of data in a certain region, the spread of
the error to the other regions can be suppressed. It also
makes parallel processing in region units feasible and thus
enables fast operation.
[0011] 

However, the above prior art has the following problem.
In a video encoding method that partitions an image into
multiple regions, the regions of temporally adjacent images
are required to be consistent with each other. In the prior
art, however, the regions are defined on the basis of the
encoding units, and the encoding units differ depending on
the encoding modes. For this reason, where the encoding modes
of adjacent images differ from each other, the patterns of
the regions will differ even when the regions are defined
under the same conditions.
[0012] 

For example, supposing the image 901 and the image 905
are two temporally adjacent images, the patterns of region
903 and region 907 are different because of the difference
between the encoding modes of the respective images. In such
inconsistent cases, corresponding regions change their shape
with time, and when the images in the regions are displayed
on the time axis, the change is heavily obstructive to human
perception.
[0013]

Furthermore, by observing rectangle 909 in Fig. 2(a)
and rectangle 910 in Fig. 2(b), it can be seen that the lower
half block of rectangle 910 belongs to the other region
(unfilled region) in the image 901. Namely, a block
corresponding to the lower half block of rectangle 910 is
absent before the unfilled region of image 901 is reproduced.
Therefore, the pertinent block is not used in predictive
coding, which adversely affects the efficiency of
compression encoding.

[0014] 

An object of the present invention is, therefore, to
reduce the change of the region shape due to the difference
of encoding modes in encoding and decoding of moving pictures
and to increase the efficiency of compression encoding.

[0015] 

In order to solve the above problem, a video encoding
method according to the present invention is a video encoding
method for video encoding apparatus to encode a moving picture
partitioned into a plurality of regions, the video encoding
method comprising: a step of determining an encoding mode
of each image in encoding a moving picture consisting of
a plurality of images; a step of determining a region
structural unit for partitioning the image into multiple
regions, based on the encoding mode; a step of defining the
regions on the basis of the region structural unit; a step
of encoding region information about the regions thus defined
(e.g., information about shapes of the regions); a step of
compression-encoding pixel data included in the regions,
in encoding units to generate compression-encoded data
according to the encoding mode; and an output step of
outputting the encoding mode, the region information, and
the compression-encoded data.

[0016] 

In the video encoding method according to the present
invention, the encoding mode may be one selected from: a
frame mode of performing encoding without separating scan
lines constituting an image; a field mode of performing
encoding with separating scan lines constituting an image;
an encoding-unit-switching mode of dividing an image into
a plurality of encoding units and performing encoding in
each encoding unit by either the frame mode or the field
mode; an image-unit-switching mode of performing encoding
in each image unit by either the frame mode or the field
mode; a first combination mode as a combination of the frame
mode with the encoding-unit-switching mode; and a second
combination mode as a combination of the field mode with
the image-unit-switching mode.
[0017] 

In the video encoding method according to the present
invention, each of the encoding units can be: a block
consisting of NxN pixels when the encoding mode is the frame
mode; a block consisting of NxN pixels when the encoding
mode is the field mode; or a block consisting of NxM (M is
a number of vertical pixels, and M=2N) pixels when the
encoding mode is the encoding-unit-switching mode.
[0018]

The video encoding method according to the present
invention may be configured so that when all the images
constituting the moving picture are encoded in one encoding
mode, the region structural unit is the encoding unit of that
mode, and so that when the images constituting the moving
picture are encoded in mutually different encoding modes, the
region structural unit is the largest encoding unit out of
the encoding units of the different encoding modes.
[0019] 

A video decoding method according to the present
invention is a video decoding method for video decoding
apparatus to decode a moving picture partitioned into a
plurality of regions, the video decoding method comprising:
a step of effecting input of compression-encoded data
generated from each of images constituting a moving picture,
by partitioning the image into multiple regions and
implementing compression encoding thereof; a step of
specifying an encoding mode of each image from the
compression-encoded data; a step of determining a region
structural unit for partitioning the image into multiple
regions, based on the encoding mode; a step of acquiring
region information about the regions (e.g., information about
shapes of the regions) from the compression-encoded data;
a step of defining the regions, based on the region structural
unit and the region information; a step of decoding the
compression-encoded data included in the regions thus
defined, in encoding units to generate regenerated data in
encoding units; and a step of constructing a regenerated
image from the regenerated data in encoding units in
accordance with the encoding mode.
[0020]

In the video decoding method according to the present
invention, the encoding mode may be one selected from: a
frame mode of performing encoding without separating scan
lines constituting an image; a field mode of performing
encoding with separating scan lines constituting an image;
an encoding-unit-switching mode of dividing an image into
a plurality of encoding units and performing encoding in
each encoding unit by either the frame mode or the field
mode; an image-unit-switching mode of performing encoding
in each image unit by either the frame mode or the field
mode; a first combination mode as a combination of the frame
mode with the encoding-unit-switching mode; and a second
combination mode as a combination of the field mode with
the image-unit-switching mode.

[0021]

In the video decoding method according to the present
invention, each of the encoding units can be: a block
consisting of NxN pixels when the encoding mode is the frame
mode; a block consisting of NxN pixels when the encoding
mode is the field mode; or a block consisting of NxM (M is
a number of vertical pixels, and M=2N) pixels when the
encoding mode is the encoding-unit-switching mode.
[0022] 

The video decoding method according to the present
invention may be configured so that when all the images
constituting the moving picture are encoded in one encoding
mode, the region structural unit is the encoding unit of that
mode, and so that when the images constituting the moving
picture are encoded in mutually different encoding modes, the
region structural unit is the largest encoding unit out of
the encoding units of the different encoding modes.

[0023] 

A video encoding program according to the present
invention is configured to let a computer execute processing
associated with the above-stated video encoding method.

A video decoding program according to the present
invention is configured to let a computer execute processing
associated with the above-stated video decoding method.
[0024] 

A video encoding apparatus according to the present
invention is a video encoding apparatus for encoding a moving
picture partitioned into a plurality of regions, the video
encoding apparatus comprising: encoding mode determining
means for determining an encoding mode of each image in
encoding the moving picture consisting of a plurality of
images; region structural unit determining means for
determining a region structural unit for partitioning the
image into multiple regions, based on the encoding mode;
region defining means for defining the regions on the basis
of the region structural unit; region information encoding
means for encoding region information about the regions thus
defined; and data generating means for compression-encoding
pixel data included in the regions, in encoding units to
generate compression-encoded data according to the encoding
mode.
[0025] 

A video decoding apparatus according to the present
invention is a video decoding apparatus for decoding a moving
picture partitioned into a plurality of regions, the video
decoding apparatus comprising: data input means for effecting
input of compression-encoded data generated from each of
images constituting a moving picture, by partitioning the
image into multiple regions and implementing compression
encoding thereof; encoding mode determining means for
determining an encoding mode of each image from the
compression-encoded data; region structural unit
determining means for determining a region structural unit
for partitioning the image into multiple regions, based on
the encoding mode; region information acquiring means for
acquiring region information about the regions from the
compression-encoded data; region defining means for defining
the regions, based on the region structural unit and the
region information; regenerated data generating means for
decoding the compression-encoded data included in the regions
thus defined, in encoding units to generate regenerated data
in encoding units; and regenerated image constructing means
for constructing a regenerated image from the regenerated
data in encoding units in accordance with the encoding mode.

[0026] 

The video encoding method according to the present
invention may also be configured so that, for all the images
included in the moving picture, the region structural unit
is: a block consisting of NxN pixels, in a frame mode of
performing encoding without separating scan lines
constituting each image; a block consisting of NxN pixels,
in a field mode of performing encoding with separating scan
lines constituting each image; a block consisting of NxM (M
is a number of vertical pixels, and M=2N) pixels, in an
encoding-unit-switching mode of dividing each image into
a plurality of encoding units and performing encoding in
each encoding unit by either the frame mode or the field
mode; or a block consisting of NxM (M is a number of vertical
pixels, and M=2N) pixels, in an image-unit-switching mode of
performing encoding of each image in an image unit by either
the frame mode or the field mode.
[0027] 

The video decoding method according to the present
invention may also be configured so that, for all the images
included in the moving picture, the region structural unit
is: a block consisting of NxN pixels, in a frame mode of
performing encoding without separating scan lines
constituting each image; a block consisting of NxN pixels,
in a field mode of performing encoding with separating scan
lines constituting each image; a block consisting of NxM (M
is a number of vertical pixels, and M=2N) pixels, in an
encoding-unit-switching mode of dividing each image into a
plurality of encoding units and performing encoding in each
encoding unit by either the frame mode or the field mode; or
a block consisting of NxM (M is a number of vertical pixels,
and M=2N) pixels, in an image-unit-switching mode of
performing encoding of each image in an image unit by either
the frame mode or the field mode.
[0028] 

A video encoding apparatus according to the present
invention can also be configured to comprise input means
for effecting input of a moving picture consisting of a
plurality of images; encoding mode controlling means for
determining an encoding mode of each image in encoding the
moving picture; region structural unit determining means
for determining a region structural unit for partitioning
each image into multiple regions, based on the encoding mode;
region partitioning means for defining regions on the basis
of the region structural unit and partitioning each image
into multiple regions; encoding means for
compression-encoding region information about the regions
thus defined, information of the encoding mode, and pixel
data included in the regions to generate compression-encoded
data; and outputting means for outputting the
compression-encoded data.
[0029] 

A video decoding apparatus according to the present
invention can also be configured to comprise input means
for effecting input of compression-encoded data generated
by partitioning each of images constituting a moving picture
into multiple regions and implementing compression encoding
thereof; encoding mode specifying means for specifying an
encoding mode of each image, based on the compression-encoded
data; region structural unit determining means for
determining a region structural unit for partitioning each
image into multiple regions, based on the encoding mode;
region defining means for acquiring region information about
the regions, based on the compression-encoded data, and for
defining the regions, based on the region structural unit
and the region information; and decoding means for decoding
the compression-encoded data included in the regions thus
defined, to construct a regenerated image in accordance with
the encoding mode.

[0030] 

According to these aspects of the invention, when each
constitutive image of a moving picture is partitioned into
regions under different encoding modes, a region structural
unit is determined according to the combination of the
encoding modes, regions are defined based thereon, and
encoding or decoding of the moving picture is carried out
based thereon. This permits consistent regions to be defined
between adjacent images, whereby it becomes feasible to
reduce the change of region shape due to the difference of
encoding modes and to increase the efficiency of compression
encoding.
[0031] 

The present invention will become more fully understood
from the detailed description given herein below and the
accompanying drawings, which are given by way of illustration
only and thus are not to be considered as limiting the present
invention.

Further scope of applicability of the present invention
will become apparent from the detailed description given
hereinafter. However, it should be understood that the
detailed description and specific examples, while indicating
preferred embodiments of the invention, are given by way
of illustration only, since various changes and modifications
within the spirit and scope of the invention will become
apparent to those skilled in the art from this detailed
description.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is an illustration conceptually showing the
encoding units of images in the encoding modes in the prior
art.

Fig. 2(a) is a diagram schematically showing regions
of an image partitioned in the frame mode according to the
prior art. Fig. 2(b) is a diagram schematically showing
regions of an image partitioned in the MB_AFF mode according
to the prior art.

Fig. 3 is a block diagram showing the schematic 
configuration of the video encoding apparatus according to 
the present invention. 

Fig. 4 is a flowchart showing the flow of the process
of implementing the video encoding method according to the
present invention.

Fig. 5 is a flowchart showing the flow of the process 
for determining regions for encoding. 

Fig. 6(a) is a diagram schematically showing regions
of an image partitioned on the basis of the video encoding
method according to the present invention, in the case where
the entire image is encoded in the frame mode. Fig. 6(b)
is a diagram schematically showing regions of an image
partitioned on the basis of the video encoding method, in
the case where the entire image is encoded in the MB_AFF
mode.

Fig. 7 is a block diagram showing the schematic 
configuration of the video decoding apparatus according to 
the present invention. 

Fig. 8 is a flowchart showing the flow of the process
of implementing the video decoding method according to the
present invention.

Fig. 9 is a diagram showing the configuration of the
video processing program according to the present invention.

Fig. 10(a) is a diagram showing a configuration example
of the storage area for the video processing program. Fig.
10(b) is a schematic diagram showing the appearance of a
floppy disk as a recording medium. Fig. 10(c) is a schematic
diagram showing a state in which the recording medium is
mounted into a drive connected to a computer.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0032] 
First Embodiment 

First, the first embodiment of the present invention
will be described with reference to the accompanying
drawings.

Fig. 3 is a block diagram showing a configuration of
a video encoding apparatus for implementing the video
encoding method according to the present invention. As shown
in Fig. 3, the video encoding apparatus 100 is provided with
first input terminal 101, encoding mode controller 102 in
which an encoding mode is set, second input terminal 103,
region partitioning device 104 for partitioning an image
into multiple regions, and encoder 105. The encoding mode
controller 102 has region determining unit 116.
[0033] 

The operation of video encoding apparatus 100 of the
above configuration and each of the steps of the video
encoding method implemented thereby will be described below.

Conditions for encoding of an image are entered through
input terminal 101 (S201 in Fig. 4). The input means differ
depending upon application programs; conceivable means
include, for example, a mode of entering a predetermined
template according to a compression rate, a mode in which
a user enters designated conditions through a keyboard, and
so on.

[0034]

The aforementioned encoding conditions include image
encoding modes. The encoding modes are, for example, as
follows.

(1) A frame mode of performing encoding without
separating scan lines constituting an image.

(2) A field mode of performing encoding with even scan
lines and odd scan lines constituting an image separated
from each other.

(3) An encoding-unit-switching mode of partitioning
an image into multiple encoding units and performing encoding
in encoding units by either the frame mode or the field mode
(MB_AFF mode).

(4) An image-unit-switching mode of performing
encoding in image units by either the frame mode or the field
mode.

(5) A first combination mode which combines (1) with (3).

(6) A second combination mode which combines (2) with (3).

[0035] 

According to these modes, the region determining unit
116 determines regions for encoding (S202 in Fig. 4). Details
of the processing will be described later with Fig. 5. An
image as a target for encoding is fed through the second
input terminal 103, and is then partitioned into multiple
regions (slices) according to the regions determined at S202,
by the region partitioning unit 104. At the same time, the
region partitioning unit 104 divides pixel values included
in the regions, in encoding units (S203 in Fig. 4).
[0036] 

The encoding units differ according to the encoding
modes. In the frame mode, the encoding units are macroblocks
each consisting of 16x16 pixels; in the field mode, the
encoding units are macroblocks each consisting of 16x16
pixels; in the encoding-unit-switching mode, the encoding
units are "macroblock pairs" each consisting of 16x32 (where
32 is the number of vertical pixels) pixels. The size of
encoding units is not limited to 16x16 and 16x32 and may be
any other size.
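
By way of illustration only, the relation between encoding modes and encoding units described above may be sketched in Python as follows. The table name ENCODING_UNIT_SIZE, the mode labels, and the function count_encoding_units are hypothetical names introduced for this sketch; the handling of the field mode (two fields of half the image height, each covered by macroblocks) is an assumption made for the example.

```python
# Illustrative sketch only; names and mode labels are assumptions.
# Encoding-unit size as (horizontal pixels, vertical pixels) per mode.
ENCODING_UNIT_SIZE = {
    "frame": (16, 16),   # macroblock
    "field": (16, 16),   # macroblock within each separated field
    "mb_aff": (16, 32),  # macroblock pair (encoding-unit-switching mode)
}


def count_encoding_units(width, height, mode):
    """Count the encoding units needed to cover a width x height image."""
    unit_w, unit_h = ENCODING_UNIT_SIZE[mode]
    if mode == "field":
        # Assumption: the image splits into two fields of height // 2 lines,
        # and each field is covered by macroblocks independently.
        fields, lines = 2, height // 2
    else:
        fields, lines = 1, height
    cols = -(-width // unit_w)   # ceiling division
    rows = -(-lines // unit_h)
    return fields * cols * rows
```

Under these assumptions, a 64x64 image is covered by 16 macroblocks in the frame mode and by 8 macroblock pairs in the MB_AFF mode.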
[0037] 

The image having the pixel values divided in encoding
units at S203 is fed to the encoder 105 and thereafter is
compression-encoded in encoding units by motion compensation
and discrete cosine transform (S204 in Fig. 4). Namely, ME/MC
(Motion Estimation/Motion Compensation) 114 detects a
motion vector of the image, using a reference image stored
in frame memory 113, and thereafter a difference is calculated
from a motion-compensated predictive signal (108 in Fig.
3). Furthermore, the difference signal is subjected to
discrete cosine transform in DCT 109, thereafter the
resultant data is quantized in Q (Quantization) 110, and
then the quantized data is subjected to variable-length
coding in VLC (Variable Length Coding) 115. This generates
compression-encoded data.
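
By way of illustration only, the difference calculation, discrete cosine transform, and quantization described above may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the processing of encoder 105 itself: the orthonormal DCT-II, the uniform quantizer, the step parameter, and the names dct_2d and encode_block are all assumptions introduced for the example, and motion estimation and variable-length coding are omitted.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (assumed transform)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def encode_block(pixels, prediction, step):
    """Residual -> DCT -> uniform quantization (hypothetical step size)."""
    n = len(pixels)
    # Difference from the predictive signal.
    residual = [[pixels[y][x] - prediction[y][x] for x in range(n)]
                for y in range(n)]
    coeffs = dct_2d(residual)
    # Uniform quantization of the DCT coefficients.
    return [[int(round(c / step)) for c in row] for row in coeffs]
```

With a constant residual, only the top-left (DC) coefficient is nonzero after quantization, which is why well-predicted blocks compress efficiently.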
[0038] 

On the other hand, the quantized signal is subjected
to inverse quantization and inverse discrete cosine transform
in IQ+IDCT (Inverse Quantization + Inverse Discrete Cosine
Transform) 111, and thereafter the resultant is added to
the predictive signal 162 (112 in Fig. 3), thereby generating
an image. The generated image is stored as a reference image
into the frame memory 113. An image encoded in the frame
mode is regenerated here and thereafter is stored into the
frame memory 113 as it is. An image encoded in the field
mode is regenerated here, and thereafter is stored into the
frame memory 113 after merging of even scan lines and odd
scan lines. An image encoded in the MB_AFF mode is
regenerated here and thereafter is stored in the form of
macroblock pairs into the frame memory 113.
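
By way of illustration only, the separation of an image into even and odd scan lines and the merging performed before storage into the frame memory may be sketched in Python as follows. The function names split_fields and merge_fields are hypothetical, and the sketch assumes that an image is a list of scan lines, that the even scan lines are those at indices 0, 2, 4, and so on, and that the number of scan lines is even.

```python
def split_fields(frame):
    """Separate a frame (list of scan lines) into even and odd fields."""
    return frame[0::2], frame[1::2]


def merge_fields(even_lines, odd_lines):
    """Interleave even and odd scan lines back into a single frame."""
    frame = []
    for even, odd in zip(even_lines, odd_lines):
        frame.append(even)
        frame.append(odd)
    return frame
```

Splitting a frame and merging the two fields again restores the original order of the scan lines, so an image regenerated in the field mode can be stored in the same layout as one regenerated in the frame mode.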

[0039] 

The compression-encoded data generated at S204 is fed
to header information adding unit 106 (HDR in Fig. 3) and
is combined with the encoding mode information including
the image encoding mode and with information about the shape
of regions to obtain data in a predetermined format (S205 in
Fig. 4). Then the data is transmitted or recorded via the
output terminal 107 (S206 in Fig. 4). The information about
the shape of regions herein is a rule for generating the
regions, and examples thereof are the orders indicated by
dashed arrow 404 shown in Fig. 6(a) and by dashed arrow 408
shown in Fig. 6(b), both described later.
[0040] 

Subsequently, the process of determining the regions
for encoding at S202 of Fig. 4 will be described with reference
to Fig. 5. Fig. 5 is a diagram showing the flow of the process
for defining (or determining) the regions. When a signal
is entered through input terminal 101 at S301, encoding mode
information used in the entire moving picture is acquired
from this input signal (S302). The encoding mode is one of
the aforementioned modes (1) to (6).
[0041] 

Next, step S303 determines whether a single mode
is applied to all the images constituting the moving picture.
When the result of the determination is affirmative (S303;
YES), the flow moves to S304. In this case, since all the
images are encoded in identical encoding units (macroblocks
in the frame mode or macroblock pairs in the MB_AFF mode),
the region structural unit (slice map unit) can be equal
to an encoding unit.

[0042] 

In contrast, where the images constituting the
moving picture are encoded in mutually different modes, i.e.,
where the result of the above determination is negative (S303;
NO), the flow moves to S305. In this case, the sizes of
encoding units in the respective encoding modes are compared
with each other, and the largest encoding unit among them is
selected as the region structural unit.
[0043] 

For example, where the frame mode and the MB_AFF mode
are mixed as encoding modes, the encoding units in the
respective modes are macroblocks and macroblock pairs.
Therefore, a macroblock pair being the largest is selected
as a region structural unit.
[0044] 

In the image-unit-switching mode of encoding each of
the images constituting a video picture in image units by
either the frame mode or the field mode, the region structural
unit is 16 pixels (horizontal) x 32 pixels (vertical). The
reason for this is as follows.
[0045]

As described previously, the encoding units in the frame
mode are 16x16. On the other hand, the encoding units in
each field in the field mode are also 16x16, and thus encoding
units after merging of two fields forming one frame are
substantially 16x32. Therefore, according to the rule of
defining the largest encoding unit as the region structural
unit (the rule described at S305), the region structural
unit in the image-unit-switching mode is defined as 16x32
in conformity with the field mode having the largest encoding
units.
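
By way of illustration only, the determination at S303 to S305 may be sketched in Python as follows. The names ENCODING_UNIT, MERGED_UNIT, and region_structural_unit and the mode labels are hypothetical; treating the field-mode unit as 16x32 when comparing sizes across modes follows the reasoning given above for the image-unit-switching mode, and comparing sizes by pixel area is an assumption of this sketch.

```python
# Encoding unit of each mode as (horizontal, vertical) pixels.
ENCODING_UNIT = {"frame": (16, 16), "field": (16, 16), "mb_aff": (16, 32)}

# Size each unit occupies in the merged frame; used only when modes are
# mixed (two 16x16 field macroblocks merge into substantially 16x32).
MERGED_UNIT = {"frame": (16, 16), "field": (16, 32), "mb_aff": (16, 32)}


def region_structural_unit(modes_used):
    """Return the region structural unit for the modes used in a picture."""
    modes = set(modes_used)
    if len(modes) == 1:
        # S303 YES -> S304: a single mode, so the encoding unit is used.
        return ENCODING_UNIT[next(iter(modes))]
    # S303 NO -> S305: select the largest unit among the modes in use.
    return max((MERGED_UNIT[m] for m in modes),
               key=lambda size: size[0] * size[1])
```

For instance, a moving picture encoded only in the frame mode yields a 16x16 unit, whereas mixing the frame mode with either the MB_AFF mode or the field mode yields a 16x32 unit.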

[0046] 

The regions for partitioning of each image are defined
on the basis of the region structural unit determined in
this way (S306), and they are outputted (S307). Since each
region is constructed on the basis of the region structural
unit, the smallest region has the size equal to the region
structural unit and a region smaller than it will never be
defined.

[0047] 

Fig. 6(a) and Fig. 6(b) are diagrams showing examples
of regions in images partitioned by the video encoding method
according to the present invention. For image 401 of
Fig. 6(a), all images are assumed to be encoded in the
frame mode, and the region structural unit is therefore a
macroblock, i.e., the encoding unit. The image 401 is
partitioned into region 403a (filled region) and region 403b
in accordance with the rule (order) indicated by dashed arrow
404. Block 402 represents the region structural unit.
[0048] 

Likewise, for image 405 of Fig. 6(b), all images
are assumed to be encoded in the MB_AFF mode, and thus the
region structural unit is a macroblock pair, i.e., the
encoding unit. The image 405 is partitioned into region
407a (filled region) and region 407b in accordance with the
rule (order) indicated by dashed arrow 408. Block 406
represents the region structural unit.

[0049] 

In the video encoding method according to the present
invention, as described above, where the frame mode and the
MB_AFF mode are mixed, the encoding units corresponding to
the respective modes are macroblocks and macroblock pairs.
A macroblock pair is selected as the largest of them and
is defined as the region structural unit. Since the regions
are defined on the basis of the macroblock pair, all the
images will be partitioned as shown in the image 405 of Fig.
6(b), independent of the encoding modes.
[0050]

Namely, a common region structural unit is determined
instead of the individual encoding units, and the regions of
all the images are defined on that basis. As a result,
partitioning according to the same rule yields regions of the
same shape, independent of the encoding modes of the
respective images, whereby consistency is maintained between
regions of temporally adjacent images. Consequently, the
method and apparatus according to the present invention
reduce the obstruction to human perception caused by change
in the shape of regions due to the difference of encoding
modes. At the same time, the method and apparatus reduce the
adverse effect on the efficiency of predictive coding.
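
The effect of the common region structural unit can be sketched as
follows. This is an illustration under stated assumptions: the actual
partitioning order indicated by dashed arrows 404 and 408 is not
reproduced here, and a simple raster order is assumed instead; the
function name, the image size, and the region labels are hypothetical.

```python
def partition(width, height, unit, units_in_first_region):
    """Label each region structural unit 'a' or 'b' in raster order.

    Raster order is an assumption made for this sketch; the rule shown
    in the figures may differ.
    """
    unit_w, unit_h = unit
    cols, rows = width // unit_w, height // unit_h
    labels = ["a" if i < units_in_first_region else "b"
              for i in range(cols * rows)]
    return [labels[r * cols:(r + 1) * cols] for r in range(rows)]


# With frame and MB_AFF modes mixed, the common unit is the 16x32
# macroblock pair, so every image is partitioned on the same grid
# regardless of its own encoding mode.
common_unit = (16, 32)
map_for_frame_mode_image = partition(64, 64, common_unit, 3)
map_for_mb_aff_mode_image = partition(64, 64, common_unit, 3)
print(map_for_frame_mode_image == map_for_mb_aff_mode_image)
```

If each image instead used its own encoding unit (16x16 for the frame
mode, 16x32 for the MB_AFF mode), the same rule would produce grids of
different shapes, which is the inconsistency the common unit avoids.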
[0051] 
Second Embodiment 

Subsequently, the second embodiment of the present
invention will be described with reference to Fig. 7 and
Fig. 8.

Fig. 7 is a block diagram showing a configuration of
a video decoding apparatus for implementing the video
decoding method according to the present invention. As shown
in Fig. 7, the video decoding apparatus 500 is provided with
input terminal 501, decoder 502, output terminal 503,
encoding mode controller 504, and data analyzer 505. The
encoding mode controller 504 has a region specifying unit
511.
[0052]

The operation of the video decoding apparatus 500 of
the above configuration and each of the steps of the video
decoding method implemented thereby will be described below.

Compression-encoded data generated by the video
encoding method in the first embodiment is fed through input
terminal 501 (S601 in Fig. 8). The compression-encoded data
is analyzed by the data analyzer 505 to decode variable-length
codes thereof, and thereafter header information is outputted
to the encoding mode controller 504. The encoding mode
controller 504 specifies the encoding mode of the
compression-encoded data with reference to the encoding mode
described in the header information (S602 in Fig. 8). The
encoding mode specified herein is one of the modes (1) to
(6) described in the first embodiment.

[0053]

In S603, the regions used in encoding are derived on the
basis of the encoding mode thus specified and the
region-generating rule described in the header information
(the order indicated by dashed arrow 404 in Fig. 6(a)). The
process of deriving the encoding regions in the present step
is substantially the same as the process of determining the
encoding regions described with reference to Fig. 5, and thus
the illustration and detailed description thereof are omitted
herein.
[0054] 

The compression-encoded data in the regions derived
at S603 is decoded in encoding units (S604). Namely, the
image data (DCT coefficients, motion information, etc.)
outputted from the data analyzer 505 of Fig. 7 is fed into
the decoder 502, and thereafter the data is subjected to
inverse quantization in IQ (Inverse Quantization) 506 on
the basis of the encoding mode specified by the encoding
mode controller 504. Thereafter, the dequantized data is
subjected to inverse discrete cosine transform in IDCT
(Inverse Discrete Cosine Transform) 507, the image data is
also subjected to motion compensation in MC (Motion
Compensation) 510, and thereafter the motion-compensated
data is added to the predictive signal (508 in Fig. 7), thereby
regenerating the image.
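
The order of operations at S604 (inverse quantization, inverse DCT,
addition of the predictive signal) can be sketched in one dimension.
This is an illustration only: the single uniform quantization step,
the orthonormal DCT scaling, the 1-D block, and all function names are
assumptions, and the prediction is passed in directly rather than
derived by motion compensation.

```python
import math


def dequantize(levels, step):
    """Inverse quantization, assuming a single uniform step size."""
    return [level * step for level in levels]


def idct_1d(coeffs):
    """Orthonormal 1-D inverse DCT (DCT-III)."""
    n = len(coeffs)
    samples = []
    for x in range(n):
        total = 0.0
        for k, coeff in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * coeff * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
        samples.append(total)
    return samples


def reconstruct(levels, step, prediction):
    """Dequantize, inverse-transform, then add the predictive signal."""
    residual = idct_1d(dequantize(levels, step))
    return [p + r for p, r in zip(prediction, residual)]


# A block whose only nonzero coefficient is the DC term yields a flat
# residual, which is added to the prediction sample by sample.
print(reconstruct([2, 0, 0, 0], 2, [10, 10, 10, 10]))
```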
[0055] 

Furthermore, the regenerated image is stored into frame
memory 509, and is outputted at its display time via the
output terminal 503 to a display device (not shown). When
the regenerated image is stored into the frame memory 509,
it is constructed from the data decoded at S604, in
accordance with the encoding mode (S605).






[0056] 

Namely, an image encoded in the frame mode is first
reconstructed and thereafter is stored into the frame memory
509 as it is. An image encoded in the field mode is first
reconstructed, and thereafter is stored into the frame memory
509 after merging of even scan lines and odd scan lines.
An image encoded in the MB_AFF mode is first reconstructed
and thereafter is stored in the form of macroblock pairs
into the frame memory 509.

Then the regenerated image constructed at S605 is
outputted via the output terminal 503 to a display device
(not shown).
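
The merging of even and odd scan lines performed for a field-mode
image before storage can be sketched as follows. The function name and
the representation of an image as a list of scan lines are assumptions
made for illustration, and both fields are assumed to contain the same
number of lines.

```python
def merge_fields(even_field, odd_field):
    """Interleave two fields into one frame.

    even_field holds scan lines 0, 2, 4, ... and odd_field holds scan
    lines 1, 3, 5, ...; both are assumed to have equal length.
    """
    frame = []
    for even_line, odd_line in zip(even_field, odd_field):
        frame.append(even_line)
        frame.append(odd_line)
    return frame


even = [["line0"], ["line2"]]
odd = [["line1"], ["line3"]]
print(merge_fields(even, odd))
```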
[0057] 

In the video decoding method according to the present
invention, as described above, the decoding is carried out
under the condition that the regions of each image are defined
in the common region structural unit on the basis of the
encoding mode. For this reason, the regions made by
partitioning according to the same rule are all of the same
shape, independent of the encoding modes of the respective
images, so that consistency is maintained between regions
of temporally adjacent images. Therefore, the method and
apparatus according to the present invention reduce the
obstruction to human perception caused by change in the shape
of regions of regenerated images due to the difference of
encoding modes. At the same time, the method and apparatus
reduce the adverse effect on the efficiency of predictive
encoding.

[0058] 

It is noted that the present invention is not limited
to the contents described above in the first and second
embodiments but can adopt appropriately modified
embodiments without departing from the scope of the
invention. For example, the above embodiments described the
typical examples in which the encoding and decoding were
carried out in the field mode while separating the even scan
lines and odd scan lines of each image from each other, but
the present invention is applicable to any separating method.
For example, the present invention is also applicable to
a case wherein the zeroth, fourth, eighth, and twelfth scan
lines are separated out into a first subimage, the first,
fifth, ninth, and thirteenth scan lines into a second
subimage, the second, sixth, tenth, and fourteenth scan lines
into a third subimage, and the third, seventh, eleventh,
and fifteenth scan lines into a fourth subimage. In this
case, supposing each subimage is encoded in macroblock units,
it is necessary to define the region structural unit on the
assumption that the effective encoding units after merging of
all the scan lines are sets of four macroblocks.
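
The four-subimage separation described above can be sketched as
follows. The function names are hypothetical, and the 16x64 size of
the effective unit is an inference from "sets of four macroblocks",
made by analogy with the 16x32 unit of the two-field case; the text
does not state it explicitly.

```python
MACROBLOCK = (16, 16)  # (width, height) in pixels


def separate_scan_lines(lines, n_subimages):
    """Assign scan line i to subimage i mod n_subimages."""
    return [lines[i::n_subimages] for i in range(n_subimages)]


def effective_unit(n_subimages):
    """Effective encoding unit after merging all subimages.

    Assumes the merged macroblocks stack vertically, as in the field
    mode, where two 16x16 units become one 16x32 unit.
    """
    width, height = MACROBLOCK
    return (width, height * n_subimages)


subimages = separate_scan_lines(list(range(16)), 4)
print(subimages[0])       # scan lines of the first subimage
print(effective_unit(4))  # inferred effective unit for four subimages
```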
[0059] 

Lastly, a program for implementing the video encoding
method or the video decoding method according to the present
invention will be described with reference to Fig. 9.

As shown in Fig. 9, video processing program 11 is stored
in program storage area 10a formed in recording medium 10.
The video processing program 11 can be executed by a computer,
including a portable terminal, and has main module 12
responsible for video processing, video encoding program 13
described later, and video decoding program 14 described
later.
[0060] 

The video encoding program 13 comprises encoding
mode determining module 13a, region structural unit
determining module 13b, region defining module 13c, region
information encoding module 13d, and compression-encoded
data generating module 13e. The functions realized
by operation of these modules are similar to the functions
realized by execution of the respective steps of the
aforementioned video encoding method.
[0061] 

The video decoding program 14 comprises
compression-encoded data input module 14a, encoding mode
specifying module 14b, region structural unit determining
module 14c, region defining module 14d, regenerated data
generating module 14e, and regenerated image constructing
module 14f. The functions realized by operation of
these modules are similar to the functions realized
by execution of the respective steps of the aforementioned
video decoding method.
[0062] 

By recording the video processing program 11 in the
recording medium 10, it becomes feasible to make a computer,
including a portable terminal, readily execute the processing
described in each of the above embodiments. More
specifically, the video processing program 11 is stored in
the program storage area 10a of a floppy disk having the
physical format shown in Fig. 10(a), for example. A plurality
of concentric tracks T are formed from the periphery toward
the center in the program storage area 10a, and each track
T is segmented into sixteen sectors S in the circumferential
direction.
[0063] 

The program storage area 10a is housed in floppy disk
casing C, as shown in Fig. 10(b), thereby forming a floppy
disk as recording medium 10. When the recording medium 10
is mounted in floppy disk drive 20 connected through a cable
to well-known, commonly used computer system 30, as shown
in Fig. 10(c), the video processing program 11 shown in Fig.
9 becomes ready to be read out of the recording medium 10
and is transferred to the computer system 30.

[0064] 

The recording medium 10 is not necessarily limited to
the floppy disk, but it can be of any form as long as the
program can be recorded therein; for example, it can be a
hard disk, an IC (Integrated Circuit) card, a ROM (Read Only
Memory), or the like.

[0065] 

From the invention thus described, it will be obvious
that the embodiments of the invention may be varied in many
ways. Such variations are not to be regarded as a departure
from the spirit and scope of the invention, and all such
modifications as would be obvious to one skilled in the art
are intended for inclusion within the scope of the following
claims.